Abstract. We establish a relationship between the word complexity
and the number of generalized diagonals for a polygonal billiard.
We conclude that in the rational case the complexity function has
cubic upper and lower bounds. In the tiling case the complexity has
cubic asymptotic growth.



1. Introduction 

A billiard ball, i.e. a point mass, moves inside a polygon $Q \subset \mathbb{R}^2$
with unit speed along a straight line until it reaches the boundary
$\partial Q$, then instantaneously changes direction according to the mirror
law: "the angle of incidence is equal to the angle of reflection," and
continues along the new line.

How complex is the game of billiards in a polygon? The first result in
this direction, proven independently by Sinai and by Boldrighini, Keane
and Marchetti [BKM], is that the metric entropy with respect to the
invariant phase volume is zero. Sinai's proof in fact shows more: the
"metric complexity" grows at most polynomially. Furthermore, it is known
that the topological entropy (in various senses) is zero [GKT], [GuH].



To prove finer results there are two natural quantities one can count.
The first is the number of generalized diagonals, that is, (oriented)
orbit segments which begin and end at a vertex of the polygon and contain
no vertex of the polygon in their interior. The number of links of a
generalized diagonal is called its combinatorial length, while its
geometric length is simply the sum of the lengths of the segments. Let
$N_g(t)$ (resp. $N_c(n)$) be the number of generalized diagonals of
geometric (resp. combinatorial) length at most $t$ (resp. $n$). Katok has
shown that $N_g(t)$ grows slower than any exponential [K]. Masur has shown
that for rational polygons, that is, for polygons all of whose inner
angles are commensurable with $\pi$, $N_g(t)$ has quadratic upper and
lower bounds [M1], [M2]. By elementary reasoning there is a constant
$B > 1$ such that $B^{-1} < N_c(t)/N_g(t) < B$, thus all of these results
easily extend to the quantity $N_c(n)$. Furthermore, Veech has shown that
there is a special class of polygons, now commonly referred to as Veech
polygons (for example regular polygons), such that the quantity
$N_g(t)/t^2$ admits a limit as $t$ tends to infinity [V1].



To introduce the second natural quantity which can be counted, label
the sides of $Q$ by symbols from a finite alphabet $\mathcal{A}$ whose
cardinality is equal to the number of sides of $Q$. We code the orbit by
the sequence of sides it hits. Consider the set $\mathcal{L}(n)$ of all
words of length $n$ which arise via this coding. Let
$p(n) := \#\mathcal{L}(n)$; this is called the complexity function of the
language $\mathcal{L}(\cdot)$. The only general results known about the
complexity function are that it grows slower than any exponential [K] and
at least quadratically. For billiards in a square the complexity function
has been explicitly calculated, albeit for a slightly different coding
(the alphabet consists of two symbols, one for vertical sides and one for
horizontal sides) [Mi], [BP]. For this coding of the square the codes
which appear are known as the Sturmian sequences. In fact it is not hard
to relate the complexity functions $p_4$ and $p_2$ of the four-letter and
two-letter codings: the relationship is $p_4(n) = 4p_2(n) - 4$. There are
some related results on the complexity when one restricts to certain
initial conditions: for rational polygons the "directional complexity" in
each direction is known explicitly [H1], while for general polygons there
are polynomial upper bounds for the directional complexity [GuT].
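The relationship between the two codings of the square can be explored with a short brute-force count. The sketch below rests on two standard facts that are not proved in this paper and are treated here as assumptions: the finite factors of Sturmian sequences are exactly the balanced binary words, and their number is $1 + \sum_{i=1}^{n} (n-i+1)\varphi(i)$, where $\varphi$ is Euler's totient. All function names are illustrative.

```python
from itertools import product
from math import gcd


def is_balanced(word):
    """True if any two equal-length factors differ by at most one in their number of 1s."""
    n = len(word)
    for length in range(1, n + 1):
        counts = [sum(word[i:i + length]) for i in range(n - length + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def p2_bruteforce(n):
    """Number of balanced binary words of length n (assumed to equal p_2(n))."""
    return sum(1 for w in product((0, 1), repeat=n) if is_balanced(w))


def totient(k):
    """Euler's totient, computed naively."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def p2_formula(n):
    """Closed form 1 + sum_{i=1}^{n} (n - i + 1) * phi(i) for the same count."""
    return 1 + sum((n - i + 1) * totient(i) for i in range(1, n + 1))


def p4(n):
    """Four-letter complexity via the relation p_4(n) = 4 * p_2(n) - 4 from the text."""
    return 4 * p2_formula(n) - 4


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, p2_bruteforce(n), p2_formula(n), p4(n))
```

The brute-force count is exponential in $n$ and is only meant for small lengths; the closed form is what one would use to tabulate $p_4(n)$ further.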



There are several good surveys of billiards in polygons; in these
surveys one can find more details about the definitions and more precise
statements of the above mentioned results. We refer the reader to [Gu1].



Our main theorem shows that $p(n)$ and $N_c(n)$ are related.

Theorem 1.1. For any convex polygon
$$p(n) = \sum_{j=0}^{n-1} N_c(j).$$

Here we remark that $N_c(0)$ is the number of vertices of the polygon,
while the sides of $Q$ are not counted as generalized diagonals. Applying
the above mentioned results of Masur [M1], [M2] we immediately conclude

Corollary 1.2. If $Q$ is a rational convex polygon then there are positive
constants $D_1, D_2$ such that
$$D_1 < p(n)/n^3 < D_2$$
for all $n \in \mathbb{N} \setminus \{0\}$.
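The passage from quadratic bounds on $N_c$ to cubic bounds on $p$ is a short summation. As a sketch, assume constants $0 < c_1 \le c_2$ (hypothetical names, not used elsewhere in the text) with $c_1 j^2 \le N_c(j) \le c_2 j^2$ for all $j \ge 1$. Theorem 1.1 then gives

```latex
\[
  p(n) \;=\; N_c(0) + \sum_{j=1}^{n-1} N_c(j)
  \;\le\; N_c(0) + c_2\,\frac{(n-1)\,n\,(2n-1)}{6}
  \;\le\; N_c(0) + \frac{c_2}{3}\,n^3 ,
\]
\[
  p(n) \;\ge\; c_1\,\frac{(n-1)\,n\,(2n-1)}{6}
  \;\ge\; \frac{c_1}{12}\,n^3
  \qquad (n \ge 2),
\]
```

where the last inequality uses $n - 1 \ge n/2$ and $2n - 1 \ge n$ for $n \ge 2$. Together with $p(1) = N_c(0) > 0$ this yields constants $D_1, D_2$ as in the corollary.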

Next we exhibit several examples where there are exact asymptotics.
We show

Theorem 1.3. If $Q$ is the square, the isosceles right triangle or the
equilateral triangle then
$$\lim_{n \to \infty} \frac{p(n)}{n^3} \qquad (1)$$
exists. The following table expresses the limit.

Polygon                               Limit (1)
Square                                $4/\pi^2$
$(\pi/2, \pi/4, \pi/4)$-triangle      $2/(3\pi^2)$
Equilateral triangle                  $3/(4\pi^2)$

The proof of Theorem 1.1 is split into two parts. The first part is
combinatorial. It uses the notion of bispecial words which was developed
by Cassaigne. The second part is geometric and uses a counting argument
based on Euler's formula.

Remark: it is known that for $n$ sufficiently large the complexity of
each aperiodic individual word is $4(n+1)$ for the square, $3(n+2)$ for
the equilateral triangle, $4(n+2)$ for the isosceles right triangle and
$6(n+2)$ for the half equilateral triangle [H1], [H2]. For the square the
complexity is four times larger than that of Sturmian sequences, thus the
fact that $p(n)$ is asymptotically four times the number of Sturmian words
of length $n$ is not surprising [BP].

Any infinite word of eventual complexity $3(n+2)$ whose language
(i.e. the collection of finite factors) is invariant under cyclic
permutations of the letters arises as the coding of a billiard trajectory
in the equilateral triangle. The third entry of the table in Theorem 1.3
gives the asymptotic growth rate of the number of all such words.

Two interesting tiling cases for which the limit (1) remains to be
evaluated are the $(\pi/2, \pi/3, \pi/6)$-triangle and the hexagon. In the
triangular case the methods developed for the other three cases allow us
to conclude that this limit exists and to calculate it explicitly. Since
the combinatorics of this case is more complicated than the others, we
leave its explicit computation to the dedicated reader.

The hexagonal case remains open since the corresponding lattice point
counting problem seems not to have been investigated.

It would be interesting to know if the limit (1) exists in the case of
Veech polygons and also to exhibit cases when it does not exist.

2. A combinatorial lemma

Let $p(0) := 0$ and for any $n \ge 1$ let $s(n) := p(n+1) - p(n)$. For
$u \in \mathcal{L}(n)$ let
$$m_l(u) := \#\{a \in \mathcal{A} : au \in \mathcal{L}(n+1)\},$$
$$m_r(u) := \#\{b \in \mathcal{A} : ub \in \mathcal{L}(n+1)\},$$
$$m_b(u) := \#\{(a,b) \in \mathcal{A}^2 : aub \in \mathcal{L}(n+2)\}.$$
We remark that all three of these quantities are larger than or equal to
one. A word $u \in \mathcal{L}(n)$ is called left special if $m_l(u) > 1$,
right special if $m_r(u) > 1$ and bispecial if it is left and right
special. Let $\mathcal{BL}(n) := \{u \in \mathcal{L}(n) : u \text{ is bispecial}\}$.
In this section we show that

Theorem 2.1. For any polygon $Q$
$$s(n+1) - s(n) = \sum_{v \in \mathcal{BL}(n)} \big(m_b(v) - m_l(v) - m_r(v) + 1\big).$$



Remark: there is no assumption of convexity for this theorem, in fact 
it is not necessary that the language arises from the coding of a polygonal 
billiard. 

Figure 1. A generalized diagonal of combinatorial length 4 with code bed.

Proof. Since for every $u \in \mathcal{L}(n+1)$ there exist $b \in \mathcal{A}$
and $v \in \mathcal{L}(n)$ such that $u = vb$, we have
$$s(n) = \sum_{u \in \mathcal{L}(n)} \big(m_r(u) - 1\big).$$
Thus
$$s(n+1) - s(n) = \sum_{v \in \mathcal{L}(n+1)} \big(m_r(v) - 1\big) - \sum_{u \in \mathcal{L}(n)} \big(m_r(u) - 1\big).$$
For $u \in \mathcal{L}(n+1)$ we can write $u = av$ where $a \in \mathcal{A}$ and
$v \in \mathcal{L}(n)$, thus
$$s(n+1) - s(n) = \sum_{v \in \mathcal{L}(n)} \Big[ \sum_{av \in \mathcal{L}(n+1)} \big(m_r(av) - 1\big) - \big(m_r(v) - 1\big) \Big].$$

For any word $v \in \mathcal{L}(n)$ and $av \in \mathcal{L}(n+1)$, any legal
prolongation to the right of $av$ is a legal prolongation to the right of
$v$ as well; thus if $m_r(v) = 1$ then $m_r(av) = 1$. Thus words with
$m_r(v) = 1$ do not contribute to the above sum, and $s(n+1) - s(n)$ is
equal to the above sum restricted to those $v$ such that $m_r(v) > 1$. If
furthermore $m_l(v) = 1$ then there is only a single $a$ such that
$av \in \mathcal{L}(n+1)$. For this $a$ we have $m_r(av) = m_r(v)$, thus such
words do not contribute to the sum either. Thus we can restrict the sum to
bispecial words, yielding
$$s(n+1) - s(n) = \sum_{v \in \mathcal{BL}(n)} \Big[ \sum_{av \in \mathcal{L}(n+1)} \big(m_r(av) - 1\big) - \big(m_r(v) - 1\big) \Big].$$

The lemma follows since for any $v \in \mathcal{BL}(n)$ we have
$$m_b(v) = \sum_{av \in \mathcal{L}(n+1)} m_r(av)$$
and
$$m_l(v) = \sum_{av \in \mathcal{L}(n+1)} 1.$$

□
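Since Theorem 2.1 only uses that the language is closed under taking factors and that every word extends to the left and to the right, it can be checked on any such language. The sketch below does this for the factors of a bi-infinite periodic word, which is a hypothetical test language chosen for convenience and not a billiard coding. The function names are illustrative assumptions.

```python
def cyclic_factors(period, n):
    """All length-n factors of the bi-infinite word ...period period period..."""
    k = len(period)
    return {"".join(period[(i + t) % k] for t in range(n)) for i in range(k)}


def check_theorem_2_1(period, n):
    """Return (s(n+1) - s(n), sum over bispecial v of m_b - m_l - m_r + 1)."""
    alphabet = sorted(set(period))
    lang = {m: cyclic_factors(period, m) for m in range(n, n + 3)}
    p = {m: len(words) for m, words in lang.items()}
    # s(n+1) - s(n) = p(n+2) - 2 p(n+1) + p(n)
    lhs = p[n + 2] - 2 * p[n + 1] + p[n]

    rhs = 0
    for v in lang[n]:
        m_l = sum(1 for a in alphabet if a + v in lang[n + 1])
        m_r = sum(1 for b in alphabet if v + b in lang[n + 1])
        m_b = sum(1 for a in alphabet for b in alphabet
                  if a + v + b in lang[n + 2])
        if m_l > 1 and m_r > 1:  # v is bispecial
            rhs += m_b - m_l - m_r + 1
    return lhs, rhs


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, check_theorem_2_1("abaababaabaab", n))
```

Both components of the returned pair should agree for every $n \ge 1$; the case $n = 0$ is left out because it depends on the convention chosen for $p(0)$.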



3. Proof of Theorem 1.1

Let $X := \{(s, v) : s \in \partial Q \text{ and } v \text{ is an inner pointing unit vector}\}$
and let $\mathcal{P}$ be the "partition" of $X$ induced by the sides of $Q$.
The ambiguity of the definition of $\mathcal{P}$ at the vertices plays no
role in our discussion. Let $T : X \to X$ be the billiard ball map. An
element of the partition
$\mathcal{P} \vee T^{-1}\mathcal{P} \vee \cdots \vee T^{-(n-1)}\mathcal{P}$
is called an $n$-cell. The code of every point in an $n$-cell has the same
prefix of length $n$, thus there is a bijection between the set of
$n$-cells and the language $\mathcal{L}(n)$.

If the footpoint of $T^n x$ is a vertex then we say that $x$ belongs to a
discontinuity of order $n$. A discontinuity (of any order) is locally a
curve whose endpoints lie on the boundary of $X$ or on a discontinuity of
lower order. We call each piece between such endpoints a smooth branch of
the discontinuity.

For $v \in \mathcal{BL}(n)$ let $gd(v)$ be the number of generalized
diagonals of length $n+1$ such that the code of (the nonsingular part of)
the generalized diagonal is $v$ (see figure 1). Let $I_l(v) := m_l(v) - 1$
and $I_r(v) := m_r(v) - 1$. For short we call $(I_l(v), I_r(v), gd(v))$ the
index of $v$.

Lemma 3.1. Suppose that $Q$ is a convex polygon. For any $v \in \mathcal{BL}(n)$
$$m_b(v) = I_l(v) + I_r(v) + gd(v) + 1.$$

Proof. We consider the $n$-cell $C$ with bispecial code $v$. Note that an
$n$-cell is a convex polygon [K], thus geometrically the number $m_b(v)$
corresponds to the number of pieces $C$ is cut into by the discontinuities
of order $-1$ and $n$.

Let $r$ be the number of sides of $Q$. There are $I_l(v) < r$ vertices of
$Q$ which produce the splitting on the left; they cut $C$ via
singularities of $T^{-1}$. Similarly there are $I_r(v) < r$ vertices which
produce the splitting on the right; they correspond to cutting $C$ via
singularities of $T^n$.

Suppose the index of $v$ is $(i, j, k)$. The cell $C$ is cut by $i + j$
singularities with $k$ intersections inside the interior of $C$. We claim
that since $Q$ is convex, each of these $k$ intersections consists of an
intersection of exactly two smooth branches of the singularities. Consider
an intersection point $x$. Its forward orbit arrives at a vertex in, say,
$m > 0$ steps and ends. Thus $x$ belongs to the interior of a
discontinuity of order $m$. The forward orbit hits no other vertex before
time $m$, and by definition ends at time $m$, thus $x$ belongs to the
interior of no other discontinuity of positive order. There are two
possible continuations by continuity of the orbit of $x$. If either of
these continuations is a generalized diagonal or tangent to a side of $Q$
then $x$ is an end point of another singularity of positive order. In the
second case the order of this additional singularity is also $m$, while in
the first case it is strictly larger than $m$. We note that the second
possibility can only happen if $Q$ is not convex. Similarly, considering
the backwards orbit we see that $x$ belongs to the interior of a single
discontinuity of negative order. If $Q$ is not convex then it is not the
end point of any negative discontinuity of greater order. The claim is
proven.

Next we will use Euler's formula to conclude our lemma. Let $F$, $E$, $V$
stand for the number of faces, edges and vertices respectively of the
partition of the interior of $C$ by the discontinuities of order $-1$ and
$n$. We have $E := i + j + 2k$ and $V := k$. By Euler's formula we have
$V - E + F = 1$, thus $F = 1 - V + E = 1 + i + j + k$. As discussed above
$m_b(v) = F$.

□
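Lemma 3.1 is used by substituting it into Theorem 2.1. Written out as a sketch, with $I_l(v) = m_l(v) - 1$ and $I_r(v) = m_r(v) - 1$ as defined above, the summand of Theorem 2.1 collapses to the number of generalized diagonals coded by $v$:

```latex
\[
  m_b(v) - m_l(v) - m_r(v) + 1
  = \bigl(I_l(v) + I_r(v) + gd(v) + 1\bigr)
    - \bigl(I_l(v) + 1\bigr) - \bigl(I_r(v) + 1\bigr) + 1
  = gd(v),
\]
\[
  \text{hence}\qquad
  s(n+1) - s(n) = \sum_{v \in \mathcal{BL}(n)} gd(v).
\]
```

In other words, for a convex polygon the second difference of $p$ at $n$ counts the generalized diagonals whose code is a bispecial word of length $n$. Summing these increments over the lengths, with the conventions for $p(0)$ and $N_c(0)$ fixing the initial terms, is what produces the sum of the $N_c(j)$ in Theorem 1.1.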



Proof of Theorem 1.1. The theorem follows immediately from Lemma 3.1 and
Theorem 2.1 since $N_c(n) = \sum_{j=0}^{n-1} \sum_{v \in \mathcal{BL}(j)} gd(v)$. □

Figure 2. Counting the generalized diagonals starting at the origin of
combinatorial length at most 3 in the square.



4. Proof of Theorem 1.3 



It is well known that if the images of $Q$ under the action of $A(Q)$ tile
the plane, then $Q$ is the square, the equilateral triangle, the right
isosceles triangle or the half equilateral triangle (i.e. the triangle
with angles $(\pi/2, \pi/3, \pi/6)$). We will use this tiling to calculate
$N_c(n)$.



4.1. The square. The tiling is the usual square grid. Fix a corner of
the square and call it the origin of the grid. Consider all the
generalized diagonals in the grid of combinatorial length at most $n$
which start from this corner and are in the first quadrant. From figure 2
it is clear that the number $M_c(n)$ of such generalized diagonals is
$$\#\{(i,j) \in \mathbb{N}^2 : i + j \le n + 1 \text{ and } (i,j) = 1\}$$
where $(i,j)$ is the gcd of $i$ and $j$. The condition $(i,j) = 1$ arises
since generalized diagonals stop as soon as they hit a vertex; thus if a
line through the origin hits several vertices (for example the line
$y = x$), it corresponds to only one generalized diagonal (starting at the
origin), and we count it only once. Since there are four possible starting
corners we have $N_c(n) = 4M_c(n)$.

The asymptotics of this quantity is well known by Mertens' formula and
its generalizations [HW], [N]:
$$M_c(n) \sim \frac{3n^2}{\pi^2}.$$

Figure 3. The affinely transformed grid of the equilateral triangle.

Applying Theorem 1.1 we have
$$p(n) \sim \sum_{k=1}^{n} \frac{12k^2}{\pi^2} \sim \frac{4n^3}{\pi^2}.$$
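The lattice-point count for the square is easy to reproduce numerically. The sketch below assumes that $\mathbb{N}$ means the positive integers in the definition of $M_c(n)$ (an interpretation, since the convention is not spelled out here) and ignores the special convention for $N_c(0)$, which does not affect the asymptotics. It compares $M_c(n)/n^2$ with $3/\pi^2$ and $p(n)/n^3$ with $4/\pi^2 \approx 0.405$. Function names are illustrative.

```python
from math import gcd, pi


def coprime_pairs_on_antidiagonal(k):
    """Number of (i, j) with i, j >= 1, i + j = k and gcd(i, j) = 1."""
    return sum(1 for i in range(1, k) if gcd(i, k - i) == 1)


def square_counts(n_max):
    """Return lists M_c(0..n_max) and p(0..n_max+1), using N_c = 4 * M_c."""
    m_c, p = [], [0]
    running = 0
    for n in range(n_max + 1):
        # add the coprime pairs with i + j = n + 1
        running += coprime_pairs_on_antidiagonal(n + 1)
        m_c.append(running)
        # p(n + 1) = p(n) + N_c(n), as in Theorem 1.1
        p.append(p[-1] + 4 * running)
    return m_c, p


if __name__ == "__main__":
    m_c, p = square_counts(300)
    for n in (50, 100, 300):
        print(n, m_c[n] / n**2, 3 / pi**2, p[n] / n**3, 4 / pi**2)
```

The ratios approach their limits slowly, with a relative error of order $1/n$, so agreement to two decimal places at $n = 300$ is about what one should expect.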



4.2. The equilateral triangle. We consider the images of $Q$ under the
action of $A(Q)$, specifying that one of the vertices is at the origin and
another at the point $(1,0)$. We transform this grid to the grid in figure
3 via the affine mapping which fixes the vector $\binom{1}{0}$ and takes
the vector $\binom{\cos(\pi/3)}{\sin(\pi/3)}$ to $\binom{0}{1}$.

Consider all the generalized diagonals of combinatorial length at most
$n$ which start from the origin and are in the first quadrant. Let
$M_c(n)$ be the cardinality of this set. Since there are 3 vertices we
have $N_c(n) = 3M_c(n)$. From figure 3 one sees that
$M_c(2n) = M_c(2n+1)$ since in traversing a square in the tiling one
always crosses exactly two consecutive copies of the fundamental triangle.
From the figure it is also clear that
$$M_c(2n) = \#\{(i,j) \in \mathbb{N}^2 : i + j \le n + 1 \text{ and } (i,j) = 1\}.$$

By Mertens' formula [HW], [N]:
$$M_c(2n) \sim \frac{3n^2}{\pi^2}, \qquad \text{i.e.} \qquad M_c(n) \sim \frac{3n^2}{4\pi^2}.$$

Figure 4. The grid of the right isosceles triangle.

Applying Theorem 1.1 we have
$$p(n) \sim \frac{3n^3}{4\pi^2}.$$

4.3. The right isosceles triangle. There are two different quantities
which we must count. First we consider all the generalized diagonals of
combinatorial length at most $n$ which start from the origin of the grid
in figure 4a and are in the first quadrant. Let $M_1(n)$ be the
cardinality of this set. We also consider all the generalized diagonals of
combinatorial length at most $n$ which start from the origin of the grid
in figure 4b and are in the first octant. Let $M_2(n)$ be the cardinality
of this set. There are two vertices of our triangle with angle $\pi/4$,
thus $N_c(n) = M_1(n) + 2M_2(n)$.

The number $2M_2(n)$ can also be interpreted as the cardinality of the set
of all the generalized diagonals of combinatorial length at most $n$ which
start from the origin of the grid in figure 4b and are in the first
quadrant. With this interpretation, if we overlay the grids from figures
4a and 4b we see that each generalized diagonal counted in $M_1(n)$ is
also a generalized diagonal counted in $2M_2(n+1)$ and each generalized
diagonal counted in $2M_2(n)$ is counted in $M_1(n+1)$. Thus the
asymptotics of $N_c(n)$ is the same as the asymptotics of $2M_1(n)$.

We want to count $M_1(n)$. All generalized diagonals in the argument
below will start at the origin of the grid pictured in figure 4a and all
lengths will be combinatorial lengths. We count first the solid lines
which the generalized diagonal crosses; we will deal later with the dashed
lines. For most of the argument it will not matter whether the generalized
diagonal is simple or not (i.e. contains no vertices in its interior); we
will restrict to the set of simple generalized diagonals only in the last
step of the proof.

Let $l(i,j)$ be the true combinatorial length of the generalized diagonal
starting at the origin with end point $(i,j)$ for any
$(i,j) \in \mathbb{N}^2$. We view this length as the sum of the solid lines
and the dashed lines it crosses plus one. If $i > j$ then the number of
dashed lines it crosses is $\lfloor \frac{i-j}{2} \rfloor$.

On the other hand the number of solid lines it crosses is characterized
by the following statements. Suppose that $n = 3k$; then if it crosses
$n - 1$ solid lines then $i + j = 2n/3 + 1 = 2k + 1$. Inversely, supposing
that $i + j = 2k + 1$, then it crosses $n - 1 = 3k - 1$ solid lines.

Combining these two facts we have: if
$$(i,j) \in \mathbb{N}^2 \text{ and } i > j \text{ and } i + j = 2k + 1 \qquad (2)$$
then
$$l(i,j) = n + \left\lfloor \frac{i-j}{2} \right\rfloor.$$

We need to calculate the region $\mathcal{R}(n)$ consisting of all
$(i,j) \in \mathbb{N}^2$ such that $i > j$ and $l(i,j) \le n$. To do this
fix $(i,j)$ as in (2) and a natural number $m \le i - 1$. We compare how
many fewer dashed lines are crossed by the generalized diagonal ending at
$(i-m,j)$ than by the generalized diagonal ending at $(i,j)$. This
comparison yields
$l(i-m,j) = n + \lfloor \frac{i-j}{2} \rfloor - 2m + \varepsilon_m$ where
$\varepsilon_m = m \bmod 2$. Thus $l(i-m,j) \le n$ if and only if
$n + \lfloor \frac{i-j}{2} \rfloor - 2m + \varepsilon_m \le n$. A simple
computation yields the following two implications
$$m \ge \frac{i-j}{4} \;\Longrightarrow\; l(i-m,j) \le n,$$
$$m < \frac{i-j}{4} - \frac{1}{2} \;\Longrightarrow\; l(i-m,j) \ge n + 1.$$

Let $m_0 := \min\{m : l(i-m,j) \le n\}$. From the above implications we
have $m_0 \le \frac{i-j}{4} + 1$ and $m_0 \ge \frac{i-j}{4} - \frac{1}{2}$.
Let $V(n)$ be the line $x = -y/2 + n/2$. This line is the "ideal boundary"
of the region $\mathcal{R}(n)$. The following computation shows that the
distance of the true boundary from the ideal boundary is uniformly
bounded:
$$d\big((i-m_0,j), V(n)\big) \le d\Big((i-m_0,j), \big(-\tfrac{j}{2}+\tfrac{n}{2}, j\big)\Big) = \left| i + \frac{j}{2} - \frac{n}{2} - m_0 \right| \le \frac{5}{4}.$$

Let $\Delta^+(n)$ be the triangle whose boundaries are the $x$-axis, the
line $y = x$ and the line $V(n)$. By symmetry we also define a region
$\Delta^-(n)$ in the second octant (i.e. we consider $i < j$). Let
$\tilde{M}_1(n)$ be the number of simple generalized diagonals starting at
the origin whose other end is in the region
$\Delta(n) := \Delta^+(n) \cup \Delta^-(n)$. Since the distance of the set
$\Delta(n)$ from the set $\mathcal{R}(n)$ is uniformly bounded (in $n$),
the asymptotics of $M_1(n)$ and $\tilde{M}_1(n)$ are the same. By symmetry
$\tilde{M}_1(n)$ is twice the number of relatively prime lattice points in
the region $\Delta^+(n)$.

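The area of $\Delta^+(n)$ is $n^2/12$. Thus applying Mertens' formula
[HW], [N] yields
$$\tilde{M}_1(n) \sim M_1(n) \sim 2 \times \frac{n^2}{2\pi^2} = \frac{n^2}{\pi^2}.$$
Thus
$$N_c(n) \sim 2M_1(n) \sim \frac{2n^2}{\pi^2}$$
and applying Theorem 1.1 gives
$$p(n) \sim \frac{2n^3}{3\pi^2}.$$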
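The lattice-point step can be sanity-checked numerically: the number of relatively prime lattice points in $\Delta^+(n)$ should be close to $(6/\pi^2) \cdot n^2/12 = n^2/(2\pi^2)$. The sketch below counts them directly. Including the boundary of the triangle and restricting to $j \ge 1$ are arbitrary choices made here; they do not affect the leading term. The function name is illustrative.

```python
from math import gcd, pi


def coprime_points_in_delta_plus(n):
    """Count (i, j) with 1 <= j <= i, gcd(i, j) = 1 and 2*i + j <= n.

    The constraint 2*i + j <= n is the half-plane bounded by the line
    x = -y/2 + n/2; together with 0 <= y <= x it cuts out the triangle.
    """
    count = 0
    for j in range(1, n // 3 + 1):  # the apex of the triangle is (n/3, n/3)
        for i in range(j, (n - j) // 2 + 1):
            if gcd(i, j) == 1:
                count += 1
    return count


if __name__ == "__main__":
    for n in (100, 400, 1200):
        exact = coprime_points_in_delta_plus(n)
        print(n, exact, n**2 / (2 * pi**2), exact / n**2)
```

Doubling this count gives the approximation to $\tilde{M}_1(n)$ used in the text, and doubling once more the approximation to $N_c(n)$.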

4.4. The half equilateral triangle. The procedure is along the same
lines as the previous examples. We consider the affinely transformed grid
similarly to the case of the equilateral triangle. Counting the generalized
diagonals which start at the origin reduces to an application of Mertens'
formula. The explicit description of the region to which Mertens' formula
must be applied is more complicated than in the previous examples, thus
we do not carry it out.

Acknowledgements: We would like to thank Samuel Lelievre for a critical
reading of an earlier version of this article.